feat(orm): add fuzzy search and relevance ordering (PostgreSQL)#2573

Open
docloulou wants to merge 4 commits into zenstackhq:dev from docloulou:fuzzysearch

Conversation

@docloulou

@docloulou docloulou commented Apr 10, 2026

Summary

  • Add fuzzy and fuzzyContains filter operators for String fields in where clauses, using PostgreSQL's pg_trgm extension with unaccent for accent-insensitive trigram matching
  • Add _relevance ordering in orderBy to sort results by fuzzy similarity score, supporting single and multiple fields
  • MySQL and SQLite explicitly throw NotSupported errors for these operators

New API

// Fuzzy similarity match (pg_trgm %)
await client.flavor.findMany({
  where: { name: { fuzzy: 'creme' } }
});

// Fuzzy substring match (pg_trgm <%)
await client.flavor.findMany({
  where: { name: { fuzzyContains: 'choco' } }
});

// Relevance ordering
await client.flavor.findMany({
  where: { name: { fuzzy: 'creme' } },
  orderBy: { _relevance: { fields: ['name'], search: 'creme', sort: 'desc' } }
});

Prerequisites (PostgreSQL)

The user must enable the following extensions in their PostgreSQL database:

CREATE EXTENSION IF NOT EXISTS unaccent;
CREATE EXTENSION IF NOT EXISTS pg_trgm;

Files changed

File | Changes
packages/orm/src/client/crud-types.ts | fuzzy, fuzzyContains in StringFilter; RelevanceOrderBy type
packages/orm/src/client/constants.ts | Fuzzy filter kind in FILTER_PROPERTY_TO_KIND
packages/orm/src/client/crud/dialects/base-dialect.ts | Handle fuzzy, fuzzyContains, _relevance in filter/orderBy builders; 3 abstract methods
packages/orm/src/client/crud/dialects/postgresql.ts | pg_trgm + unaccent implementation (%, <%, similarity(), GREATEST())
packages/orm/src/client/crud/dialects/mysql.ts | NotSupported errors
packages/orm/src/client/crud/dialects/sqlite.ts | NotSupported errors
packages/orm/src/client/zod/factory.ts | Zod validation schemas for fuzzy, fuzzyContains, _relevance
tests/e2e/orm/schemas/basic/schema.zmodel | Added Flavor model
tests/e2e/orm/client-api/fuzzy-search.test.ts | 35 E2E tests

Implementation details

fuzzy filter

Uses PostgreSQL trigram similarity operator % with unaccent and lower for accent-insensitive, case-insensitive matching:

unaccent(lower("name")) % unaccent(lower('creme'))
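To make the % comparison concrete, here is a rough TypeScript sketch of how trigram similarity behaves. It only approximates pg_trgm (lowercase, split into alphanumeric words, pad each word with two leading spaces and one trailing space, compare the sets of 3-character windows) and approximates unaccent by stripping combining marks after NFD normalization; it is not the extensions' actual code. The 0.3 threshold mirrors the documented default of pg_trgm.similarity_threshold, and all function names are illustrative.

```typescript
// Rough stand-in for unaccent(lower(x)): drop combining diacritics, then lowercase.
function normalizeText(text: string): string {
    return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '').toLowerCase();
}

// Collect the distinct trigrams of a string, padding each word the way pg_trgm does.
function trigrams(text: string): Set<string> {
    const result = new Set<string>();
    for (const word of normalizeText(text).split(/[^a-z0-9]+/)) {
        if (word.length === 0) continue;
        const padded = `  ${word} `;
        for (let i = 0; i + 3 <= padded.length; i++) {
            result.add(padded.slice(i, i + 3));
        }
    }
    return result;
}

// Shared trigrams divided by the total number of distinct trigrams.
function similarity(a: string, b: string): number {
    const ta = trigrams(a);
    const tb = trigrams(b);
    if (ta.size === 0 || tb.size === 0) return 0;
    let shared = 0;
    ta.forEach((t) => {
        if (tb.has(t)) shared++;
    });
    return shared / (ta.size + tb.size - shared);
}

// Assumed default threshold; PostgreSQL lets you change it per session.
const DEFAULT_THRESHOLD = 0.3;

function fuzzyMatches(value: string, search: string): boolean {
    return similarity(value, search) >= DEFAULT_THRESHOLD;
}
```

Under this approximation, 'crème' and 'creme' normalize to the same string and score 1, while 'creme' and 'cream' share 3 of 9 distinct trigrams and score 1/3, just above the threshold.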

fuzzyContains filter

Uses PostgreSQL word similarity operator <% to check if the search term is approximately contained as a substring:

unaccent(lower('choco')) <% unaccent(lower("name"))
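Note the operand order: similarity via % is symmetric, whereas <% takes the search term on the left and the column on the right. A string-level sketch of the two fragments follows. The PR builds them through Kysely's sql template rather than string concatenation, so `quoteIdent`, `fuzzyFilterSql`, and the `$1` placeholder are hypothetical names used only for illustration.

```typescript
type FuzzyMode = 'fuzzy' | 'fuzzyContains';

// Quote an identifier PostgreSQL-style, doubling any embedded double quotes.
function quoteIdent(name: string): string {
    return '"' + name.replace(/"/g, '""') + '"';
}

// Build the WHERE fragment; the search term is referenced as a bound parameter.
function fuzzyFilterSql(column: string, mode: FuzzyMode, param: string = '$1'): string {
    const col = `unaccent(lower(${quoteIdent(column)}))`;
    const term = `unaccent(lower(${param}))`;
    // fuzzy: column % term; fuzzyContains: term <% column (term on the left).
    return mode === 'fuzzy' ? `${col} % ${term}` : `${term} <% ${col}`;
}
```

Keeping the search term as a parameter rather than an inlined literal avoids quoting problems with user input.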

_relevance ordering

Uses similarity() function for single fields, GREATEST() for multiple fields:

-- Single field
ORDER BY similarity(unaccent(lower("name")), unaccent(lower('creme'))) DESC

-- Multiple fields
ORDER BY GREATEST(
  similarity(unaccent(lower("name")), unaccent(lower('chocolate'))),
  similarity(unaccent(lower("description")), unaccent(lower('chocolate')))
) DESC
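As a sketch of how the single-field and multi-field cases could be generated from one code path, assuming plain string building instead of the Kysely expressions the PR uses (`relevanceOrderBySql` and `quoteColumn` are hypothetical helpers, and `$1` stands for the bound search term):

```typescript
type SortOrder = 'asc' | 'desc';

// Quote an identifier PostgreSQL-style, doubling any embedded double quotes.
function quoteColumn(name: string): string {
    return '"' + name.replace(/"/g, '""') + '"';
}

// Build the ORDER BY expression for one or more string fields.
// The tuple type rejects an empty field list at compile time.
function relevanceOrderBySql(
    fields: [string, ...string[]],
    sort: SortOrder,
    param: string = '$1',
): string {
    const scores = fields.map(
        (field) => `similarity(unaccent(lower(${quoteColumn(field)})), unaccent(lower(${param})))`,
    );
    // A single field needs no GREATEST() wrapper.
    const score = scores.length === 1 ? scores[0] : `GREATEST(${scores.join(', ')})`;
    return `${score} ${sort.toUpperCase()}`;
}
```

PostgreSQL's GREATEST() ignores NULL arguments and returns NULL only when every argument is NULL, so a single nullable field should not null out the combined score.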

Test plan

  • Basic fuzzy search (English words with typos, transpositions, truncation)
  • Accent-insensitive fuzzy search (French words: crème, café, éclair, pâté)
  • Nullable field handling
  • Combined filters (fuzzy + contains, fuzzy + startsWith, AND/OR/NOT)
  • fuzzyContains (substring fuzzy matching)
  • _relevance ordering (single field, multiple fields, with pagination)
  • Mutations (updateMany, deleteMany with fuzzy/fuzzyContains)
  • GroupBy and count with fuzzy filters

TEST_DB_PROVIDER=postgresql pnpm vitest run tests/e2e/orm/client-api/fuzzy-search.test.ts

Documentation: zenstackhq/zenstack-docs#596

Summary by CodeRabbit

  • New Features

    • Fuzzy search (approximate, accent‑insensitive) with whole-term and substring modes; relevance-based ordering (_relevance) to rank results by similarity.
  • Database Support

    • PostgreSQL: full fuzzy/relevance support. MySQL and SQLite: fuzzy/relevance explicitly not supported.
  • Tests

    • Added comprehensive end-to-end tests covering fuzzy, fuzzyContains, relevance ordering, pagination, mutations, aggregations.
  • Documentation

    • OpenAPI/schema inputs updated to expose fuzzy operators and _relevance ordering.

… only)

- Introduced fuzzy search operators (`fuzzy`, `fuzzyContains`) in the ORM.
- Added `RelevanceOrderBy` type for sorting based on fuzzy search relevance.
- Implemented fuzzy search filters in PostgreSQL dialect.
- Added error handling for unsupported fuzzy search features in MySQL and SQLite dialects.
- Updated Zod schema factory to include fuzzy search fields.
- Created a new `Flavor` model in the schema for testing purposes.
@coderabbitai
Contributor

coderabbitai Bot commented Apr 10, 2026

📝 Walkthrough

Adds fuzzy text-search operators and relevance-based ordering: types and Zod schemas updated, base dialect builder extended with abstract hooks, Postgres implements fuzzy/relevance, MySQL/SQLite explicitly reject the features, plus e2e Postgres tests and a test-only Flavor model.
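The dialect layering described here can be pictured with a heavily simplified sketch. The class names, the single `buildFuzzyFilter` hook signature, and the error class below are assumptions made for illustration; the real builders work on Kysely expressions and have more hooks.

```typescript
// Hypothetical error type standing in for the ORM's "not supported" error.
class NotSupportedError extends Error {}

abstract class BaseDialect {
    abstract readonly provider: string;

    // Dialect-specific hook: each provider decides how (or whether) to build the filter.
    abstract buildFuzzyFilter(column: string, param: string): string;

    // Shared entry point: the base class routes the operator to the dialect hook.
    buildStringFilter(column: string, op: string, param: string): string {
        if (op === 'fuzzy') return this.buildFuzzyFilter(column, param);
        if (op === 'equals') return `"${column}" = ${param}`;
        throw new Error(`Unknown operator: ${op}`);
    }
}

class PostgresDialect extends BaseDialect {
    readonly provider = 'postgresql';

    buildFuzzyFilter(column: string, param: string): string {
        return `unaccent(lower("${column}")) % unaccent(lower(${param}))`;
    }
}

class SqliteDialect extends BaseDialect {
    readonly provider = 'sqlite';

    // Reject explicitly instead of emitting SQL the database cannot run.
    buildFuzzyFilter(): string {
        throw new NotSupportedError(`"fuzzy" filter is not supported by provider "${this.provider}"`);
    }
}
```

Making the hook abstract forces every dialect to take an explicit position, so an unsupported provider fails with a clear message rather than a database syntax error.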

Changes

Cohort | File(s) | Summary
Filter Operator Mappings | packages/orm/src/client/constants.ts | Added fuzzy and fuzzyContains mapped to new 'Fuzzy' filter kind.
Types & OrderBy | packages/orm/src/client/crud-types.ts | Added exported RelevanceOrderBy type; allowed _relevance in orderBy; extended StringFilter operator types to include fuzzy operators.
Dialect Base / Query Builder | packages/orm/src/client/crud/dialects/base-dialect.ts | Detects _relevance to reject cursor pagination; buildStringFilter supports fuzzy/fuzzyContains; added abstract methods buildFuzzyFilter, buildFuzzyContainsFilter, buildRelevanceOrderBy.
Postgres Dialect | packages/orm/src/client/crud/dialects/postgresql.ts | Implements fuzzy filters using unaccent(lower(...)) with trigram ops and implements buildRelevanceOrderBy using similarity(...) / GREATEST(...) for multi-field relevance.
MySQL / SQLite Dialects | packages/orm/src/client/crud/dialects/mysql.ts, packages/orm/src/client/crud/dialects/sqlite.ts | Add overrides that throw provider not supported errors for fuzzy, fuzzyContains, and "_relevance" ordering.
Zod Validation | packages/orm/src/client/zod/factory.ts | Zod schemas accept fuzzy/fuzzyContains string operators; order-by schema gains optional strict _relevance object with fields, search, and sort.
Tests & Test Schema | tests/e2e/orm/client-api/fuzzy-search.test.ts, tests/e2e/orm/schemas/basic/schema.zmodel, tests/e2e/orm/schemas/basic/schema.ts, tests/e2e/orm/schemas/basic/models.ts, tests/e2e/orm/schemas/basic/input.ts | Added Postgres-only e2e fuzzy-search test suite (fixtures, many assertions, pagination/aggregation/mutation coverage) and a test-only Flavor model plus generated types.
OpenAPI Baseline | packages/server/test/openapi/baseline/rpc.baseline.yaml | Extended OpenAPI schemas: added fuzzy/fuzzyContains to string filters and added _relevance order-by object across multiple order-by schemas.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐇 I nibble letters, soft and neat,

trigrams hum beneath my feet,
accents fall and order climbs,
Postgres sings in fuzzy rhymes,
I hop, I test, I leave sweet treats.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately and specifically describes the main addition: fuzzy search and relevance ordering for PostgreSQL. It is concise, clear, and reflects the primary changes across multiple files.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
packages/orm/src/client/crud/dialects/postgresql.ts (1)

561-590: Well-implemented PostgreSQL fuzzy search using pg_trgm.

The implementation correctly uses:

  • Trigram similarity operator (%) for fuzzy
  • Word similarity operator (<%) for fuzzyContains with proper operand ordering
  • GREATEST() aggregation for multi-field relevance scoring

The use of sql template tags is appropriate here as these are PostgreSQL-specific operators not available in Kysely's type-safe API. The sql template is Kysely's escape hatch mechanism.

Note: Extension dependencies (pg_trgm and unaccent) are already documented in the type definitions (crud-types.ts). Consider adding runtime error handling if extensions are missing, similar to the createNotSupportedError pattern used for MySQL/SQLite, to provide users with a clearer message instead of a generic PostgreSQL error.
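One possible shape for such a check, offered as a sketch rather than a proposal for the actual code: query the pg_extension catalog and turn any gap into an actionable message. The `RunQuery` type is a hypothetical minimal query interface, and the helper names are made up for this example.

```typescript
const REQUIRED_EXTENSIONS = ['pg_trgm', 'unaccent'] as const;

// Catalog query listing which of the required extensions are installed.
const INSTALLED_EXTENSIONS_SQL =
    "SELECT extname FROM pg_extension WHERE extname IN ('pg_trgm', 'unaccent')";

// Pure helper: which required extensions are absent from the installed list?
function missingExtensions(installed: string[]): string[] {
    return REQUIRED_EXTENSIONS.filter((ext) => installed.indexOf(ext) === -1);
}

// Pure helper: build an error message that includes the remediation statements.
function missingExtensionsMessage(missing: string[]): string {
    const fixes = missing.map((ext) => `CREATE EXTENSION IF NOT EXISTS ${ext};`).join(' ');
    return `Fuzzy search requires PostgreSQL extension(s): ${missing.join(', ')}. Run: ${fixes}`;
}

// Hypothetical query function; a real driver or Kysely instance would be adapted to it.
type RunQuery = (sql: string) => Promise<Array<{ extname: string }>>;

async function ensureFuzzyExtensions(runQuery: RunQuery): Promise<void> {
    const rows = await runQuery(INSTALLED_EXTENSIONS_SQL);
    const missing = missingExtensions(rows.map((row) => row.extname));
    if (missing.length > 0) {
        throw new Error(missingExtensionsMessage(missing));
    }
}
```

Running the check lazily on first use of a fuzzy feature, and caching a successful result, would keep it from adding a catalog round trip to every query.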

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/orm/src/client/crud/dialects/postgresql.ts` around lines 561 - 590,
Add runtime checks for the required PostgreSQL extensions and throw a clear
user-facing error if missing: implement an internal check (e.g.,
ensurePostgresExtensionsAvailable) that queries pg_extension for 'pg_trgm' and
'unaccent' and call it from the PostgreSQL dialect initialization or lazily
before using fuzzy features; update buildFuzzyFilter, buildFuzzyContainsFilter,
and buildRelevanceOrderBy to call this check (or ensure it's called beforehand)
and throw a createNotSupportedError-style error with a clear message and
remediation steps if either extension is absent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/orm/src/client/crud-types.ts`:
- Around line 912-930: Update the RelevanceOrderBy type and its JSDoc to match
runtime behavior: change the _relevance.fields type from plain array to a
NonEmptyArray<NonRelationFields<Schema, Model>> so an empty fields list is
rejected at the type level, and revise the comment for _relevance to indicate
that relevance uses PostgreSQL similarity() (and that MySQL is not supported /
throws NotSupported at runtime) so IntelliSense reflects actual provider
constraints; locate the RelevanceOrderBy type and the _relevance field
declaration to make these edits.

In `@packages/orm/src/client/crud/dialects/base-dialect.ts`:
- Around line 1110-1131: The _relevance branch adds complex ordering but cursor
pagination still assumes simple {field: 'asc'|'desc'} entries; update handling
so cursor with a _relevance order is either rejected early or supported: modify
the code path that constructs cursor filters (function buildCursorFilter) to
detect order entries where field === '_relevance' (created via
buildRelevanceOrderBy / buildFieldRef / negateSort) and generate a comparison
that first compares computed relevance (value.search against the same fields)
then applies a deterministic tie-breaker (e.g., primary key) in the same sort
direction, or alternatively throw a clear validation error when a cursor is
supplied alongside an _relevance order; ensure tests cover both rejection and
correct SQL generation if you implement support.
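The simpler of the two options, rejecting the combination up front, might look like the sketch below. The `FindArgs` shape and function names are simplified assumptions and do not mirror the ORM's real argument types.

```typescript
// Simplified order-by entry: either a plain { field: 'asc' | 'desc' } or a _relevance object.
interface OrderByEntry {
    [field: string]: unknown;
}

interface FindArgs {
    cursor?: Record<string, unknown>;
    orderBy?: OrderByEntry | OrderByEntry[];
}

// True when any order-by entry uses the special _relevance key.
function usesRelevanceOrdering(orderBy: FindArgs['orderBy']): boolean {
    if (!orderBy) return false;
    const entries = Array.isArray(orderBy) ? orderBy : [orderBy];
    return entries.some((entry) => '_relevance' in entry);
}

// Fail fast with a clear message instead of generating an incorrect cursor filter.
function assertCursorCompatible(args: FindArgs): void {
    if (args.cursor !== undefined && usesRelevanceOrdering(args.orderBy)) {
        throw new Error('Cursor pagination cannot be combined with "_relevance" ordering');
    }
}
```

Offset pagination with skip/take is unaffected by this guard, since it does not need to compare rows against a cursor position.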

In `@packages/orm/src/client/zod/factory.ts`:
- Around line 1180-1192: The _relevance.fields enum is currently built from all
scalar fields (scalarFieldNames) which allows non-string types; change the
scalarFieldNames computation in the getModelFields/filter pipeline to include
only string-typed scalar fields (e.g., filter by the field metadata indicating
type === 'String' or equivalent in your field definition) so that
_relevance.fields contains only string fields, and keep the z.enum(...) usage
but fed from the new string-only scalarFieldNames; update the code around
getModelFields, scalarFieldNames, and the _relevance strictObject construction
to reflect this restriction.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5c89d679-7173-415f-83ce-5738308b98ee

📥 Commits

Reviewing files that changed from the base of the PR and between 39a0a28 and 62fc9d7.

📒 Files selected for processing (12)
  • packages/orm/src/client/constants.ts
  • packages/orm/src/client/crud-types.ts
  • packages/orm/src/client/crud/dialects/base-dialect.ts
  • packages/orm/src/client/crud/dialects/mysql.ts
  • packages/orm/src/client/crud/dialects/postgresql.ts
  • packages/orm/src/client/crud/dialects/sqlite.ts
  • packages/orm/src/client/zod/factory.ts
  • tests/e2e/orm/client-api/fuzzy-search.test.ts
  • tests/e2e/orm/schemas/basic/input.ts
  • tests/e2e/orm/schemas/basic/models.ts
  • tests/e2e/orm/schemas/basic/schema.ts
  • tests/e2e/orm/schemas/basic/schema.zmodel

Comment thread packages/orm/src/client/crud-types.ts
Comment thread packages/orm/src/client/crud/dialects/base-dialect.ts
Comment thread packages/orm/src/client/zod/factory.ts Outdated
- _relevance.fields restricted to String fields in the Zod schema
- Reject cursor pagination combined with _relevance ordering
- RelevanceOrderBy type restricted to StringFields with a non-empty tuple
- JSDoc updated to reflect PostgreSQL-only support
@ymc9
Member

ymc9 commented Apr 14, 2026

Hi @docloulou, thanks for this amazing PR! It's a very useful feature and well implemented.

I'm wondering if you're fine with delaying it to release v3.7 or 3.8. Asking this because, although not directly related, it's a bit odd to support fuzzy search but not regular full text search (a feature gap from Prisma). I hope to get FTS implemented, probably in 3.7, and we can have this feature either together or in a subsequent minor release. What do you think?

@docloulou
Author

No problem for me. If the code in this PR looks solid to you, it can serve as a good template for adding the FTS feature. The main things left to handle would be adding the @@fulltext([xxxx]) annotations and the associated migrations: https://www.prisma.io/docs/v6/orm/prisma-client/queries/full-text-search#mysql-1

Note: one thing to watch out for - in this PR I’m using _relevance (as Prisma does for FTS) for the fuzzy search, so there could be a conflict.

@iPommes

iPommes commented Apr 16, 2026

I'm really loving the idea of this! I would definitely use this a lot :D

@ymc9
Member

ymc9 commented Apr 16, 2026

> No problem for me. If the code in this PR looks solid to you, it can serve as a good template for adding the FTS feature. The main things left to handle would be adding the @@fulltext([xxxx]) annotations and the associated migrations: https://www.prisma.io/docs/v6/orm/prisma-client/queries/full-text-search#mysql-1
>
> Note: one thing to watch out for - in this PR I’m using _relevance (as Prisma does for FTS) for the fuzzy search, so there could be a conflict.

Yes, it'll definitely help the FTS implementation. Maybe we can share the _relevance field if conflicts can be carefully avoided.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/orm/src/client/crud-types.ts`:
- Around line 906-922: The _relevance.fields type currently allows enum-backed
string literal unions because StringFields is based on MapModelFieldType
returning any string-like type; change the typing so only true free-form string
columns are allowed (i.e. require MapModelFieldType<Schema, Model, Key> to be
exactly the primitive string type, not a string-literal union or enum type) and
use that stricter alias in RelevanceOrderBy._relevance.fields; align this
compile-time guard with the runtime check implemented in the zod factory (the
string-only schema logic around factory.ts handling of String) by
creating/renaming StringFields to something like StringFieldsOnlyString and
swapping it into RelevanceOrderBy._relevance.fields.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4df1bb7c-a65b-44f5-8926-58fcd6c21dee

📥 Commits

Reviewing files that changed from the base of the PR and between de08aa4 and 512a02e.

📒 Files selected for processing (6)
  • packages/orm/src/client/crud-types.ts
  • packages/orm/src/client/crud/dialects/base-dialect.ts
  • packages/orm/src/client/crud/dialects/mysql.ts
  • packages/orm/src/client/crud/dialects/postgresql.ts
  • packages/orm/src/client/crud/dialects/sqlite.ts
  • packages/orm/src/client/zod/factory.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/orm/src/client/crud/dialects/postgresql.ts
  • packages/orm/src/client/zod/factory.ts

Comment on lines +906 to +922
type StringFields<Schema extends SchemaDef, Model extends GetModels<Schema>> = {
    [Key in NonRelationFields<Schema, Model>]: MapModelFieldType<Schema, Model, Key> extends string | null
        ? Key
        : never;
}[NonRelationFields<Schema, Model>];

export type RelevanceOrderBy<Schema extends SchemaDef, Model extends GetModels<Schema>> = {
    /**
     * Sorts by fuzzy search relevance using PostgreSQL `similarity()` from `pg_trgm`.
     * Not supported on MySQL or SQLite (throws `NotSupported` at runtime).
     * Cannot be combined with cursor-based pagination.
     */
    _relevance?: {
        /**
         * String fields to compute relevance against (must be non-empty).
         */
        fields: [StringFields<Schema, Model>, ...StringFields<Schema, Model>[]];
Contributor

⚠️ Potential issue | 🟠 Major

_relevance.fields is currently typed too broadly.

StringFields is derived from MapModelFieldType, so enum scalars also match here because their TS shape is a string-literal union. That means _relevance.fields accepts enum columns at compile time, while the runtime schema in packages/orm/src/client/zod/factory.ts:1304-1316 only allows fields whose schema type is String.
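The mismatch is easy to reproduce in isolation. The sketch below uses a made-up enum type and a simplified field-metadata shape (both hypothetical, not the ORM's real schema types) to contrast a check on the TypeScript shape with a check on the declared schema type.

```typescript
// A hypothetical enum scalar: in TypeScript it is just a union of string literals.
type FlavorKind = 'SORBET' | 'GELATO';

// Shape-based check, analogous to `MapModelFieldType<...> extends string | null`.
type ExtendsString<T> = T extends string | null ? true : false;

// Compiles, which shows that the enum passes the shape-based check.
const enumLooksLikeString: ExtendsString<FlavorKind> = true;

// Simplified field metadata keyed by the declared schema type.
type FieldDef = { type: string; array?: boolean };

const flavorFields = {
    id: { type: 'Int' },
    name: { type: 'String' },
    tags: { type: 'String', array: true },
    kind: { type: 'FlavorKind' },
} as const;

// Metadata-based check: only non-array fields declared as String qualify.
function stringFieldNames(fields: Record<string, FieldDef>): string[] {
    return Object.entries(fields)
        .filter(([, def]) => def.type === 'String' && !def.array)
        .map(([name]) => name);
}
```

With this metadata, `stringFieldNames(flavorFields)` keeps only `name`, excluding both the enum column and the string array, which is the behavior the compile-time type should match.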

Suggested fix
 type StringFields<Schema extends SchemaDef, Model extends GetModels<Schema>> = {
-    [Key in NonRelationFields<Schema, Model>]: MapModelFieldType<Schema, Model, Key> extends string | null
+    [Key in NonRelationFields<Schema, Model>]: FieldIsArray<Schema, Model, Key> extends true
+        ? never
+        : GetModelFieldType<Schema, Model, Key> extends 'String'
         ? Key
         : never;
 }[NonRelationFields<Schema, Model>];

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/server/test/openapi/baseline/rpc.baseline.yaml`:
- Around line 4326-4329: The OpenAPI generator is advertising PostgreSQL-only
search operators (fuzzy, fuzzyContains, and _relevance) unconditionally; update
the generator in packages/orm/src/client/zod/factory.ts so that the schema
properties for "fuzzy", "fuzzyContains" and "_relevance" are only emitted when
the datasource provider is Postgres (or when the required extension is enabled),
otherwise either omit them or add an explicit description/enum stating
"PostgreSQL-only / extension required"; locate the emission logic around the
existing provider-agnostic blocks (see the sections you flagged near lines
~1025-1035 and ~1304-1316) and add a provider check (or description branch)
before adding those properties to the OpenAPI schema.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 456d6bee-273c-4080-a6e3-e57912084b89

📥 Commits

Reviewing files that changed from the base of the PR and between 512a02e and 75e3d4d.

📒 Files selected for processing (1)
  • packages/server/test/openapi/baseline/rpc.baseline.yaml

Comment on lines +4326 to +4329
fuzzy:
  type: string
fuzzyContains:
  type: string
Contributor

⚠️ Potential issue | 🟠 Major

Don't advertise PostgreSQL-only search operators unconditionally.

These additions make fuzzy/fuzzyContains and _relevance part of the generated OpenAPI contract, but the implementation is PostgreSQL-only per the PR and throws NotSupported on MySQL/SQLite. The linked generator logic in packages/orm/src/client/zod/factory.ts:1025-1035 and packages/orm/src/client/zod/factory.ts:1304-1316 also appears provider-agnostic, so non-Postgres projects will expose operations that cannot succeed at runtime. Please gate these schema properties on the datasource provider, or at least emit explicit PostgreSQL-only / extension-required descriptions so generated clients are not misled.
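Gating on the provider could be as small as the sketch below. The `Provider` union, the helper name, and the base operator list (an illustrative subset, not the ORM's full set) are assumptions; the real change would live in the Zod schema factory.

```typescript
type Provider = 'postgresql' | 'mysql' | 'sqlite';

// Illustrative subset of provider-agnostic string operators.
const BASE_STRING_OPERATORS = ['equals', 'contains', 'startsWith', 'endsWith'];

// Operators that depend on pg_trgm and unaccent.
const POSTGRES_ONLY_OPERATORS = ['fuzzy', 'fuzzyContains'];

// Only advertise the fuzzy operators when the datasource can execute them.
function stringFilterOperators(provider: Provider): string[] {
    return provider === 'postgresql'
        ? BASE_STRING_OPERATORS.concat(POSTGRES_ONLY_OPERATORS)
        : BASE_STRING_OPERATORS.slice();
}

// Same idea for the special _relevance order-by key.
function supportsRelevanceOrdering(provider: Provider): boolean {
    return provider === 'postgresql';
}
```

A schema generator driven by these helpers would omit the fuzzy properties for MySQL and SQLite projects, so generated clients never see operations that can only fail.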

Also applies to: 5141-5163

